65 research outputs found
X-VLM: All-In-One Pre-trained Model For Vision-Language Tasks
Vision language pre-training aims to learn alignments between vision and
language from a large amount of data. We proposed multi-grained vision language
pre-training, a unified approach that learns vision language alignments at
multiple granularities. This paper advances the proposed method by unifying image
and video encoding in one model and scaling up the model with large-scale data.
We present X-VLM, a pre-trained VLM with a modular architecture for both
image-text tasks and video-text tasks. Experimental results show that X-VLM
performs best at both base and large scales for both image-text and video-text
tasks, making a good trade-off between performance and model scale. Moreover,
we show that the modular design of X-VLM makes it highly transferable
to any language or domain. For example, by simply
replacing the text encoder with XLM-R, X-VLM outperforms state-of-the-art
multilingual multi-modal pre-trained models without any multilingual
pre-training. The code and pre-trained models will be available at
github.com/zengyan-97/X2-VLM. Comment: 21 pages, 8 figures
A three-DOF ultrasonic motor using four piezoelectric ceramic plates in bonded-type structure
A three-DOF ultrasonic motor is presented in this paper. The proposed motor consists of four piezoelectric ceramic plates and a metal base with a flange that can fix the motor on a rack. The proposed motor takes advantage of a longitudinal mode and two bending modes, different hybrids of which can realize three-DOF actuation. Because of the symmetric structure of the proposed motor, the resonance frequencies of the two bending modes are identical. The resonance frequency of the longitudinal mode was tuned close to those of the bending modes by adjusting the structural parameters in modal analysis. Trajectories of nodes on the driving foot were then obtained by transient analysis to verify the feasibility of the driving principle. Experiments including a vibration shape test and an output characteristic test were carried out. The starting voltages of the rotations about the horizontal axes are about 10 Vp-p. Under driving voltages of 200 Vp-p, the output velocities of the three DOFs can reach 280 rpm, 277 rpm and 327 rpm, respectively. The experimental results indicate that the proposed motor is characterized by low starting voltages and high output velocities.
3D Super-Resolution Ultrasound with Adaptive Weight-Based Beamforming
Super-resolution ultrasound (SRUS) imaging through localising and tracking
sparse microbubbles has been shown to reveal microvascular structure and flow
beyond the wave diffraction limit. Most SRUS studies use standard delay and sum
(DAS) beamforming, where the large main lobe and significant side lobes make
separation and localisation of densely distributed bubbles challenging,
particularly in 3D due to the typically small aperture of matrix array probes.
This study aims to improve 3D SRUS by implementing a low-cost 3D coherence
beamformer based on channel signal variance, as well as two other adaptive
weight-based coherence beamformers: nonlinear beamforming with p-th root
compression and coherence factor. The 3D coherence beamformers, together with
DAS, are compared in computer simulation, on a microflow phantom, and in vivo.
Simulation results demonstrate that the adaptive weight-based beamformers can
significantly narrow the main lobe and suppress the side lobes for modest
computational cost. Significantly improved 3D SR images of a microflow phantom
and a rabbit kidney are obtained through the adaptive weight-based beamformers.
The proposed variance-based beamformer performs best in simulations and
experiments. Comment: Ultrasound localisation microscopy (ULM), super-resolution,
contrast-enhanced ultrasound, 3D beamforming
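The difference between plain DAS and adaptive weighting described in the abstract above can be illustrated with a minimal sketch. It assumes the channel samples for one voxel have already been time-aligned, and it uses the commonly cited textbook forms of the coherence factor and of signed p-th root compression. The function names are hypothetical, and the paper's own variance-based beamformer is not reproduced here because the abstract does not give its formula.

```python
import math

def das(samples):
    """Delay-and-sum: average the already time-aligned channel samples."""
    return sum(samples) / len(samples)

def coherence_factor(samples):
    """Coherent energy divided by incoherent energy across channels (0 to 1)."""
    incoherent = len(samples) * sum(s * s for s in samples)
    if incoherent == 0.0:
        return 0.0
    return sum(samples) ** 2 / incoherent

def cf_weighted_das(samples):
    """DAS output scaled by the coherence factor of the same samples."""
    return coherence_factor(samples) * das(samples)

def pth_root_das(samples, p=2.0):
    """Signed p-th root of each sample, average, then signed p-th power."""
    compressed = [math.copysign(abs(s) ** (1.0 / p), s) for s in samples]
    q = sum(compressed) / len(compressed)
    return math.copysign(abs(q) ** p, q)

# Illustrative channel data only (not taken from the paper).
coherent = [1.0, 1.0, 1.0, 1.0]    # on-axis scatterer: channels agree
partial = [1.0, 1.0, 1.0, -1.0]    # off-axis energy: channels disagree

print(das(coherent), cf_weighted_das(coherent))
print(das(partial), cf_weighted_das(partial), pth_root_das(partial))
```

In this toy case the fully coherent signal passes unchanged, while the partially coherent one is attenuated more by the weighted and nonlinear variants than by DAS, which is the mechanism by which such beamformers suppress side lobes.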
Ultrafast 3-D Super Resolution Ultrasound using Row-Column Array specific Coherence-based Beamforming and Rolling Acoustic Sub-aperture Processing: In Vitro, In Vivo and Clinical Study
The row-column addressed array is an emerging probe for ultrafast 3-D
ultrasound imaging. It achieves this with far fewer independent electronic
channels and a wider field of view than traditional 2-D matrix arrays of the
same channel count, making it a good candidate for clinical translation.
However, the image quality of row-column arrays is generally poor, particularly
when investigating tissue. Ultrasound localisation microscopy allows for the
production of super-resolution images even when the initial image resolution is
not high. Unfortunately, the row-column probe can suffer from imaging artefacts
that can degrade the quality of super-resolution images, as 'secondary' lobes
from bright microbubbles can be mistaken for microbubble events, particularly
when operated using plane wave imaging. These false events move through the
image in a physiologically realistic way, so they can be challenging to remove via
tracking, leading to the production of 'false vessels'. Here, a new type of
rolling window image reconstruction procedure was developed, which integrated a
row-column array-specific coherence-based beamforming technique with acoustic
sub-aperture processing to reduce 'secondary' lobe
artefacts and noise and to increase the effective frame rate. Using an in
vitro cross tube, it was found that the procedure reduced the percentage of
'false' locations from 26% to 15% compared to traditional
orthogonal plane wave compounding. Additionally, it was found that the noise
could be reduced by 7 dB and that the effective frame rate could be
increased to over 4000 fps. Subsequently, in vivo ultrasound
localisation microscopy was used to produce images non-invasively of a rabbit
kidney and a human thyroid.
Hot Water Pretreatment of Boreal Aspen Woodchips in a Pilot Scale Digester
Aspen woodchips were treated by hot water extraction at about 160 °C for 2 h with a liquor-to-solid ratio of 4.76:1 in a 1.84 m³ batch reactor with external liquor circulation. Both five-carbon and six-carbon sugars are obtained in the extraction liquor. Xylose and xylooligomers are the main five-carbon sugars in the hot water extract, reaching maximum concentrations of 0.016 mol/L and 0.018 mol/L, respectively. Minor monosaccharides including galactose, mannose, rhamnose, glucose, and arabinose are also obtained during the hot water extraction. Rhamnose is the main six-carbon sugar in the extraction liquor, with a maximum concentration of 0.0042 mol/L. The variations of acetyl groups and formic acid are investigated because of their catalytic effect on the extraction reactions. Zeroth-order kinetic models are found to be adequate for describing the dissolved solids, acids, xylose, and xylooligomers.
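A zeroth-order kinetic model, as mentioned in the abstract above, assumes a constant rate, so concentration changes linearly with time: C(t) = C0 + k·t. The sketch below fits k and C0 by ordinary least squares. The function name and the sample values are hypothetical and purely illustrative; they are not measurements from the study.

```python
def fit_zeroth_order(times, concentrations):
    """Least-squares fit of C(t) = c0 + k * t; returns (k, c0)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_c = sum(concentrations) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(times, concentrations))
    k = sxy / sxx               # constant rate, mol/(L*min)
    c0 = mean_c - k * mean_t    # concentration at t = 0, mol/L
    return k, c0

# Synthetic, exactly linear data (assumed values, not from the paper).
times_min = [0.0, 30.0, 60.0, 90.0, 120.0]
conc_mol_per_l = [0.001 + 1.0e-4 * t for t in times_min]

k, c0 = fit_zeroth_order(times_min, conc_mol_per_l)
print(k, c0)
```

With real data, a roughly constant residual around the fitted line over the extraction period is what would support the zeroth-order description.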
Sonomyographic Prosthetic Interaction: Online Simultaneous and Proportional Control of Wrist and Hand Motions Using Semisupervised Learning
Human hands play a very important role in daily object manipulation. Current prosthetic hands are capable of mimicking most functions of the human hand, but how to interact with prosthetic hands based on human intentions remains an open problem. In this article, we proposed a wearable ultrasound-based interface to achieve simultaneous and proportional control of (1) wrist rotation (pronation/supination) and (2) hand grasp (open/close). A semisupervised learning framework integrating principal component analysis and sparse Gaussian process regression (SPGP) was proposed to simplify the cumbersome model calibration, which is a key issue that hinders the practical application of existing simultaneous and proportional prosthetic control approaches. The proposed algorithms were verified with both offline and online experiments on 12 able-bodied subjects. The offline analysis showed that the first principal component of ultrasound features (PC#1) was inherently linear to wrist rotations and that the SPGP was able to establish the mapping between ultrasound features and hand grasp kinematics with less training data. The online target achievement control test showed that the proposed method can achieve accurate control of a virtual prosthesis, with a motion completion rate of 97.61 ± 4.67%, a motion completion time of 4.66 ± 0.91 s, and a stability error of 10.99 ± 1.69 degrees. This is the first study to achieve online simultaneous and proportional control of wrist and hand kinematics using ultrasound and semisupervised learning, paving the way for the era of muscle morphology-driven prosthetic control.
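The offline finding that PC#1 is linear in wrist rotation can be illustrated with a small sketch. It extracts the first principal component by power iteration on the sample covariance matrix and checks its correlation with a known angle. The feature vectors are synthetic and the function names are hypothetical; this shows only the PCA step and is not the authors' semisupervised framework or their SPGP model.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (direction, scores) of PC#1 via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic features that vary linearly with wrist angle (assumed data).
angles_deg = [-60.0, -30.0, 0.0, 30.0, 60.0]
features = [[0.6 * a, 0.8 * a, 1.0] for a in angles_deg]

direction, pc1 = first_principal_component(features)
print(direction, pearson(pc1, angles_deg))
```

Because the sign of a principal component is arbitrary, only the magnitude of the correlation is meaningful when comparing PC#1 with the rotation angle.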